Q-Networks for Binary Vector Actions
نویسنده
چکیده
In this paper reinforcement learning with binary vector actions was investigated. We suggest an effective architecture of the neural networks for approximating an action-value function with binary vector actions. The proposed architecture approximates the action-value function by a linear function with respect to the action vector, but is still non-linear with respect to the state input. We show that this approximation method enables the efficient calculation of greedy action selection and softmax action selection. Using this architecture, we suggest an online algorithm based on Q-learning. The empirical results in the grid world and the blocker task suggest that our approximation architecture would be effective for the RL problems with large discrete action sets.
منابع مشابه
Prediction of true critical temperature and pressure of binary hydrocarbon mixtures: A Comparison between the artificial neural networks and the support vector machine
Two main objectives have been considered in this paper: providing a good model to predict the critical temperature and pressure of binary hydrocarbon mixtures, and comparing the efficiency of the artificial neural network algorithms and the support vector regression as two commonly used soft computing methods. In order to have a fair comparison and to achieve the highest efficiency, a comprehen...
متن کاملGNG-Based Q-Learning
In this paper, we present a new developmental architecture that joins the categorizational power of Growing Neural Gas networks with an action policy for discrete states and actions. The result is a robot brain that can choose its next move by associating its current sensor inputs with a particular subsection of the possible input vectors. GNG networks are used for vector quantization, to gener...
متن کاملApplication of Artificial Neural Networks in a Two-step Classification for Acute Lymphocytic Leukemia Diagnosis by Blood Lamella Images
Introduction: This study aimed to present a system based on intelligent models that can enhance the accuracy of diagnostic systems for acute leukemia. The three parts including preprocessing, feature extraction, and classification network are considered as associated series of actions. Therefore, any dysfunction or poor accuracy in each part might lead in general dysfunction of...
متن کاملNeural Network Performance Analysis for Real Time Hand Gesture Tracking Based on Hu Moment and Hybrid Features
This paper presents a comparison study between the multilayer perceptron (MLP) and radial basis function (RBF) neural networks with supervised learning and back propagation algorithm to track hand gestures. Both networks have two output classes which are hand and face. Skin is detected by a regional based algorithm in the image, and then networks are applied on video sequences frame by frame in...
متن کاملFacial expression recognition based on Local Binary Patterns
Classical LBP such as complexity and high dimensions of feature vectors that make it necessary to apply dimension reduction processes. In this paper, we introduce an improved LBP algorithm to solve these problems that utilizes Fast PCA algorithm for reduction of vector dimensions of extracted features. In other words, proffer method (Fast PCA+LBP) is an improved LBP algorithm that is extracted ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1512.01332 شماره
صفحات -
تاریخ انتشار 2015